Automatic measurement of voice onset time using discriminative structured prediction.

Authors

Morgan Sonderegger

Joseph Keshet

Abstract

A discriminative large-margin algorithm for automatic measurement of voice onset time (VOT) is described, considered as a case of predicting structured output from speech. Manually labeled data are used to train a function that takes as input a speech segment of an arbitrary length containing a voiceless stop, and outputs its VOT. The function is explicitly trained to minimize the difference between predicted and manually measured VOT; it operates on a set of acoustic feature functions designed based on spectral and temporal cues used by human VOT annotators. The algorithm is applied to initial voiceless stops from four corpora, representing different types of speech. Using several evaluation methods, the algorithm's performance is near human intertranscriber reliability, and compares favorably with previous work. Furthermore, the algorithm's performance is minimally affected by training and testing on different corpora, and remains essentially constant as the amount of training data is reduced to 50-250 manually labeled examples, demonstrating the method's practical applicability to new datasets.
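The structured-prediction framing in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that a prediction is a pair of frame indices (burst onset, voicing onset), that a linear score over feature functions is maximized by exhaustive search, and that the cost is the absolute difference between predicted and manually measured VOT. The feature map `toy_features`, the weights, and the 1 ms frame duration are hypothetical placeholders; the abstract does not specify the actual feature functions or the training procedure.

```python
from typing import Callable, List, Sequence, Tuple

Onsets = Tuple[int, int]  # (burst onset frame, voicing onset frame)


def toy_features(energy: Sequence[float], t_b: int, t_v: int) -> List[float]:
    """Hypothetical feature map: energy jump at each candidate onset, plus duration."""
    jump_at_burst = energy[t_b] - energy[t_b - 1]
    jump_at_voicing = energy[t_v] - energy[t_v - 1]
    return [jump_at_burst, jump_at_voicing, float(t_v - t_b)]


def predict_onsets(
    energy: Sequence[float],
    weights: Sequence[float],
    features: Callable[[Sequence[float], int, int], List[float]] = toy_features,
) -> Onsets:
    """Return the onset pair with the highest linear score (assumed decoding rule)."""
    n = len(energy)
    if n < 3:
        raise ValueError("need at least 3 frames to place two onsets")
    best_score = float("-inf")
    best_pair: Onsets = (1, 2)
    for t_b in range(1, n):              # candidate burst onsets
        for t_v in range(t_b + 1, n):    # voicing must follow the burst (positive VOT)
            phi = features(energy, t_b, t_v)
            score = sum(w * f for w, f in zip(weights, phi))
            if score > best_score:
                best_score, best_pair = score, (t_b, t_v)
    return best_pair


def vot_ms(onsets: Onsets, frame_ms: float = 1.0) -> float:
    """VOT is the time from burst onset to voicing onset."""
    return (onsets[1] - onsets[0]) * frame_ms


def vot_cost(predicted: Onsets, manual: Onsets, frame_ms: float = 1.0) -> float:
    """Assumed task cost: absolute difference between predicted and manual VOT."""
    return abs(vot_ms(predicted, frame_ms) - vot_ms(manual, frame_ms))


# Toy per-frame energy contour with a small rise (burst) and a larger rise (voicing).
energy = [0.0, 0.0, 0.5, 0.5, 0.5, 2.0, 2.0]
predicted = predict_onsets(energy, weights=[1.0, 1.0, 0.0])
print(predicted, vot_ms(predicted), vot_cost(predicted, manual=(2, 6)))
```

In a large-margin setting, the weights would be learned so that the manually labeled onset pair outscores competing pairs by a margin that grows with a cost such as `vot_cost`; the fixed weights above stand in for that training step.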

Similar resources

Automatic discriminative measurement of voice onset time

We describe a discriminative algorithm for automatic VOT measurement, considered as an application of predicting structured output from speech. In contrast to previous studies which use customized rules, in our approach a function is trained on manually labeled examples, using an online algorithm to predict the burst and voicing onsets (and hence VOT). The feature set used is customized for det...

Automatic Measurement of Positive and Negative Voice Onset Time

Previous work on automatic VOT measurement has focused on positive-valued VOT. However, in many languages VOT can be either positive or negative (“prevoiced”). We present a discriminative algorithm that simultaneously decides whether a stop is prevoiced and measures its VOT. The algorithm operates on feature functions designed to locate the burst and voicing onsets in the positive and negative ...

Scaling up Structured Multi-Label Prediction using Discriminative Mean Field Networks

Multi-label classification is an important task in many modern machine learning applications. Accurate methods model the correlations and relationships between labels, either by assuming a low-dimensional embedding of the labels or a graph structure of label dependencies. While such interactions can be achieved using feed-forward predictors, problems with tight coupling between labels are bette...

Automatic Measurement of Voice Onset Time and Prevoicing Using Recurrent Neural Networks

Voice onset time (VOT) is defined as the time difference between the onset of the burst and the onset of voicing. When voicing begins preceding the burst, the stop is called prevoiced, and the VOT is negative. When voicing begins following the burst the VOT is positive. While most of the work on automatic measurement of VOT has focused on positive VOT mostly evident in American English, in many...

A Decade of Discriminative Language Modeling for Automatic Speech Recognition

This paper summarizes the research on discriminative language modeling focusing on its application to automatic speech recognition (ASR). A discriminative language model (DLM) is typically a linear or log-linear model consisting of a weight vector associated with a feature vector representation of a sentence. This flexible representation can include linguistically and statistically motivated fe...

Journal:

The Journal of the Acoustical Society of America

Volume 132, Issue 6

Publication date: 2012
